-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
zvol: allow rename of in use ZVOL dataset #8371
Conversation
169c1b9
to
865e1cf
Compare
Codecov Report
@@ Coverage Diff @@
## master #8371 +/- ##
==========================================
+ Coverage 78.43% 78.67% +0.24%
==========================================
Files 380 380
Lines 115901 115898 -3
==========================================
+ Hits 90909 91187 +278
+ Misses 24992 24711 -281
Continue to review full report at Codecov.
|
This change fixes the failed assertion and does not seem to be breaking anything (at least on the surface). We need to investigate if/when this change can become an issue (e.g during |
I tried renaming a ZVOL during an incremental This system is running code from this PR:
|
While ZFS allow renaming of in use ZVOLs at the DSL level without issues the ZVOL layer does not correctly update the renamed dataset if the device node is open (zv->zv_open_count > 0): trying to access the stale dataset name, for instance during a zfs receive, will cause the following failure: VERIFY3(zv->zv_objset->os_dsl_dataset->ds_owner == zv) failed ((null) == ffff8800dbb6fc00) PANIC at zvol.c:1255:zvol_resume() Showing stack for process 1390 CPU: 0 PID: 1390 Comm: zfs Tainted: P O 3.16.0-4-amd64 openzfs#1 Debian 3.16.51-3 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 0000000000000000 ffffffff8151ea00 ffffffffa0758a80 ffff88028aefba30 ffffffffa0417219 ffff880037179220 ffffffff00000030 ffff88028aefba40 ffff88028aefb9e0 2833594649524556 6f5f767a3e2d767a 6f3e2d7465736a62 Call Trace: [<0>] ? dump_stack+0x5d/0x78 [<0>] ? spl_panic+0xc9/0x110 [spl] [<0>] ? mutex_lock+0xe/0x2a [<0>] ? zfs_refcount_remove_many+0x1ad/0x250 [zfs] [<0>] ? rrw_exit+0xc8/0x2e0 [zfs] [<0>] ? mutex_lock+0xe/0x2a [<0>] ? dmu_objset_from_ds+0x9a/0x250 [zfs] [<0>] ? dmu_objset_hold_flags+0x71/0xc0 [zfs] [<0>] ? zvol_resume+0x178/0x280 [zfs] [<0>] ? zfs_ioc_recv_impl+0x88b/0xf80 [zfs] [<0>] ? zfs_refcount_remove_many+0x1ad/0x250 [zfs] [<0>] ? zfs_ioc_recv+0x1c2/0x2a0 [zfs] [<0>] ? dmu_buf_get_user+0x13/0x20 [zfs] [<0>] ? __alloc_pages_nodemask+0x166/0xb50 [<0>] ? zfsdev_ioctl+0x896/0x9c0 [zfs] [<0>] ? handle_mm_fault+0x464/0x1140 [<0>] ? do_vfs_ioctl+0x2cf/0x4b0 [<0>] ? __do_page_fault+0x177/0x410 [<0>] ? SyS_ioctl+0x81/0xa0 [<0>] ? async_page_fault+0x28/0x30 [<0>] ? system_call_fast_compare_end+0x10/0x15 Signed-off-by: loli10K <[email protected]>
if (zv->zv_open_count > 0) { | ||
mutex_exit(&zv->zv_state_lock); | ||
continue; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the comment in #4344 (comment) this logic was added to make sure the zv_name
wasn't updated while being accessed by zvol_ioctl()
. This was a concern because it was accessed outside the zv_state_lock
. Since this is no longer the case, this should be fine. This does mean the udev symlink name may change out from underneath us, which is arguably what we want to happen.
Function zvol_rename_minors() does not check the reference count before
modifying the zv_name, which is not good since zvol_ioctl(), for instance,
accesses zv_name without holding zvol_state_lock (but with zv_open_count
being positive).
@bprotopopov would you mind commenting on this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change looks good to me. It definitely fixes the problem and the added test case is great. It also makes the behavior more consistent with how filesystems work.
There is an issue here however. Although this code helps to ensure that the zvol name matches the dataset name, the root cause here seems to be that the zvol suspending / resuming code relies on this to be true, when there actually are several ways it can get out of sync. At least a few cases where I believe this still might occur are:
- If the
zvol_inhibit_dev
tunable is set, this renaming code won't run. zvol_rename_minors_impl()
is called fromzvol_task_cb()
which is run via taskqs in several places. None of these callers seem to really care if thetaskq_dispatch()
was successful or not.- The zvol renaming code all runs asynchronously via taskqs. I don't have a specific example, but I believe there might be race conditions between suspending and the renaming tasks running, especially since
zvol_suspend()
drops thezv_state_lock
before expectingzvol_resume()
to be called.
I think we can solve this by changing zvol_suspend()
so that it takes an objset instead of a name. I think we can then get the zvol via the dataset's owner. By doing that we remove the dependency on the zvol name matching the dataset name.
I have a feeling this problem might unfortunately be a lot more systemic and a lot more than @loli10K originally wanted to bite off. Maybe we need a separate PR for this, but I think we really need to rethink how the code relies on the zvol name and all of the callers of zvol_find_by_name()
.
@behlendorf its up to you what we should do with the work that's been done here. I'm willing to approve it if we want to save this work for another PR.
@tcaputi I agree there's a larger issue to tackle here regarding lookups by zvol name. Doing this with the objset instead of the name as you suggested is definitely worth exploring as a more complete solution. I'd like to move forward with this PR as is, which resolves a real existing issue, and then we can follow up with a more comprehensive improvement. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (other than style)
Sorry was away from computer on vacation
Typos courtesy of my iPhone
On Feb 19, 2019, at 4:34 PM, Brian Behlendorf <[email protected]<mailto:[email protected]>> wrote:
@bprotopopov<https://github.com/bprotopopov> would you mind commenting on this PR.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub<#8371 (comment)>, or mute the thread<https://github.com/notifications/unsubscribe-auth/ACX4ud6V9mYdS4QyZn4D9KRwejzQY0viks5vPG3EgaJpZM4agQZ4>.
|
Motivation and Context
Fix #6263
Description
NOTE: this is a tentative fix.
While ZFS allow renaming of in use ZVOLs at the DSL level without issues the ZVOL layer does not correctly update the renamed dataset if the device node is open (zv->zv_open_count > 0): trying to access the stale dataset name, for instance during a zfs receive, will cause the following failure:
How Has This Been Tested?
Tested with reproducer added to the ZFS Test Suite.
Types of changes
Checklist:
Signed-off-by
.